Compression method and decompression method

ABSTRACT

A method includes: first setting a first and a second storage regions; first comparing a compression target data in a file with data in the first storage region; first creating a first compression code of the compression target data based on the data in the first storage region when a predetermined first consistency between the compression target data and the data in the first storage region is detected; second comparing the compression target data with data in the second storage region when the predetermined first consistency between the compression target data and the data in the first storage region is not detected, the compression target data being moved to the second storage region after the second comparing; and first storing the compression target data into the first storage region when a predetermined second consistency between the compression target data and the data in the second storage region is detected.

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation of U.S. application Ser. No. 14/742,257 filed Jun. 17, 2015, which is a continuation application of International Application PCT/JP2012/008138 filed on Dec. 19, 2012 and designated the U.S., the entire contents of which are incorporated herein by reference.

FIELD

The embodiment discussed herein is related to a compression technology and a decompression technology for data.

BACKGROUND

A compression algorithm called LZ77 is known. LZ77 is adopted in compression file formats such as ZIP.

In LZ77, a compression process for data in a file is executed sequentially from the beginning of a compression target file. In LZ77, a data reference region (referred to as a slide window or the like) is set, and data subjected to the compression process in the compression target file is stored sequentially in the slide window. The size of the slide window is set in advance. When the size of the data stored in the slide window exceeds the size of the slide window, new data is stored while the data stored earlier in the slide window is overwritten.

In the compression process executed sequentially in LZ77, compressed codes are generated based on the data string in the slide window that matches the data to be processed in the compression target file over the longest length (the longest matching data string). A compressed code is information that combines the matching length of the longest matching data string with its position in the slide window.

According to LZ77, the longer the matching length of the longest matching data string, the more data is expressed by one compressed code (a combination of the matching length and the position), which improves the compression ratio. The compression ratio also improves when many longest matching data strings of such a matching length are extracted from the compression target file. Therefore, the compression ratio tends to improve as the size of the slide window increases. This is because data matching the processing target data is searched for among more data in the compression target file, which raises the probability of finding a data string with a long matching length.

However, when the size of the slide window is enlarged to improve the compression ratio, the number of cross-checks between the processing target data and the data stored in the slide window increases. For this reason, the time taken to compress the compression target file increases.
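The trade-off described above can be illustrated with a minimal sketch of a brute-force longest-match search. The function name `longest_match` and the comparison counter are assumptions made for illustration; they do not come from LZ77 or from any particular ZIP implementation, and practical encoders use hash chains or similar structures instead of a linear scan. For simplicity, the sketch only considers matches that lie entirely inside the window.

```python
from typing import Tuple


def longest_match(window: bytes, target: bytes) -> Tuple[int, int, int]:
    """Return (position, length, comparisons) for the longest prefix of
    `target` that occurs inside `window`.

    Hypothetical helper for illustration only: a plain linear scan that
    counts how many byte comparisons were made.
    """
    best_pos, best_len, comparisons = 0, 0, 0
    for pos in range(len(window)):
        n = 0
        while n < len(target) and pos + n < len(window):
            comparisons += 1
            if window[pos + n] != target[n]:
                break
            n += 1
        if n > best_len:  # keep the earliest position among equal lengths
            best_pos, best_len = pos, n
    return best_pos, best_len, comparisons


pos, length, _ = longest_match(b"1st horse ", b"horse 2nd")
small = longest_match(b"xxxx", b"horse")[2]
large = longest_match(b"xxxxxxxx", b"horse")[2]
```

In this sketch, doubling a window that contains no match doubles the number of comparisons, which is the cost that grows with the slide window size.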

As a technical document of the related art, there is Japanese Laid-open Patent Publication No. 5-241777.

SUMMARY

According to an aspect of the invention, a method includes: first setting a first storage region and a second storage region in a memory; first comparing a compression target data in a file with data in the first storage region; first creating a first compression code of the compression target data based on the data in the first storage region when a predetermined first consistency between the compression target data and the data in the first storage region is detected; second comparing the compression target data with data in the second storage region when the predetermined first consistency between the compression target data and the data in the first storage region is not detected, the compression target data being moved to the second storage region after the second comparing; and first storing the compression target data into the first storage region when a predetermined second consistency between the compression target data and the data in the second storage region is detected.

The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a diagram illustrating an example of the flow of a compression process;

FIG. 2 is a diagram illustrating the flow of a decompression process;

FIG. 3 is a diagram illustrating an example of a functional configuration;

FIG. 4 is a diagram illustrating an example of a position information table T1;

FIG. 5 is an exemplary whole flowchart illustrating a compression process;

FIG. 6 is an exemplary flowchart illustrating a reference process for a storage region A3;

FIG. 7 is an exemplary flowchart illustrating a reference process for a storage region A2;

FIG. 8 is an exemplary flowchart illustrating an updating process for the storage region A3;

FIG. 9 is an exemplary flowchart illustrating an updating process for the storage region A2;

FIG. 10 is an exemplary flowchart illustrating generation and writing of compressed data;

FIG. 11 is a diagram illustrating an example of the format of the compressed data;

FIG. 12 is a diagram illustrating an example of a position information table T2;

FIG. 13 is an exemplary flowchart illustrating a decompression process;

FIG. 14 is an exemplary flowchart illustrating a reference process for a storage region B3;

FIG. 15 is an exemplary flowchart illustrating a reference process for a storage region B2;

FIG. 16 is an exemplary flowchart illustrating an updating process for the storage region B3;

FIG. 17 is an exemplary flowchart illustrating an updating process for the storage region B2;

FIG. 18 is an exemplary flowchart illustrating a reference process for a storage region B2;

FIG. 19 is a diagram illustrating an exemplary hardware configuration of a computer;

FIG. 20 is a diagram illustrating an example of the configuration of a program operating in the computer;

FIG. 21 is a diagram illustrating an example of the configuration of apparatuses in a system according to an embodiment;

FIG. 22 is a diagram illustrating an example of counting of the number of extractions; and

FIG. 23 is a diagram illustrating an example of control of the sizes of the storage region A2 and the storage region A3.

DESCRIPTION OF EMBODIMENT

When the size of a slide window is enlarged to improve a compression ratio, the number of cross-checks between processing target data and data stored in the slide window increases. For this reason, the time taken to compress a compression target file increases.

An aspect of an embodiment is intended to reduce the processing time taken to execute compression.

Hereinafter, an embodiment will be described.

FIG. 1 is a diagram illustrating an example of the flow of a compression process. As work areas for the compression process, a storage region A1, a storage region A2, and a storage region A3 are provided in a memory. Data is read sequentially from a compression target file F1 loaded to the storage region A1, and the read processing target data is converted sequentially into compressed data. The obtained compressed data is stored sequentially in a storage region A4, and a compressed file F2 is generated based on the compressed data stored in the storage region A4. In the compression process illustrated in FIG. 1, whenever processing target data is read from the file F1, at least one of the following is executed: generation of the compressed data according to a reference result of the storage region A3, updating of the storage region A3 according to a reference result of the storage region A2, and updating of the storage region A2.

As described above, the generation of the compressed data is executed based on a result of a reference process on the storage region A3 with respect to the processing target data in the storage region A1 and a result of a reference process on the storage region A2. The data string that matches the processing target data over the longest length (the longest matching data string) is extracted by cross-checking (“cross-checking 1” illustrated in FIG. 1) the processing target data against the data string in the storage region A3. When a longest matching data string with a length equal to or greater than a predetermined length Lmin is obtained by cross-checking 1, a compressed code is generated based on the length of the longest matching data string and the position of the longest matching data string in the storage region A3. When the length of the longest matching data string obtained by cross-checking 1 is shorter than the predetermined length Lmin, the longest matching data string is extracted by cross-checking (“cross-checking 2” illustrated in FIG. 1) the processing target data against the data string in the storage region A2. When the longest matching data string obtained by cross-checking 2 has a matching length equal to or greater than the predetermined length Lmin, a compressed code is generated based on the length of the longest matching data string and the position of the longest matching data string in the storage region A2. When the matching length of the longest matching data string obtained by cross-checking 2 is shorter than the predetermined length Lmin, no compressed code is generated and Huffman coding is executed on the processing target data. The obtained Huffman code then becomes the compressed data. In this case, the processing target data itself may be used as the compressed data without executing the Huffman coding, or a compression process other than the compression process of the embodiment may be used. Information that identifies the pre-decided method of generating the compressed data for the case in which a longest matching data string with a matching length equal to or greater than the predetermined length Lmin is obtained in neither cross-checking 1 nor cross-checking 2 is stored, for example, in the header of the compressed file.

The generated compressed code is written to the storage region A4 (“writing” illustrated in FIG. 1). The predetermined length Lmin is a threshold that determines whether a compressed code is generated by the compression process of the embodiment, and is at least 1 byte. The predetermined length Lmin is set to, for example, 3 bytes in accordance with the ZIP format.

The compressed data includes an identifier indicating the form of a code. For example, when a longest matching data string with a matching length equal to or greater than the predetermined length Lmin is obtained in neither cross-checking 1 nor cross-checking 2, the compressed data includes an identifier (for example, “0”) indicating the Huffman code (or the processing target data itself) corresponding to the processing target data. When the length of the longest matching data string obtained by cross-checking 1 or cross-checking 2 is equal to or greater than the predetermined length Lmin, the compressed data includes an identifier (for example, “1”) indicating a compressed code obtained using the compression process of the embodiment. Among the compressed codes whose identifier begins with “1”, for example, the identifier indicating a compressed code based on the longest matching data string obtained by cross-checking 1 is set to “11”, and the identifier indicating a compressed code based on the longest matching data string obtained by cross-checking 2 is set to “10”. For example, when the identifier is placed at the beginning of the compressed data and the beginning bit of the compressed data is “0”, the compressed data is determined to be compressed data generated when a longest matching data string with a matching length equal to or greater than the predetermined length Lmin was obtained in neither cross-checking 1 nor cross-checking 2. When the beginning bit of the compressed data is “1”, it is determined that a compressed code based on the longest matching data string is included. The subsequent bit then indicates whether the longest matching data string was obtained by cross-checking 1 or by cross-checking 2. When the subsequent bit is “0”, the compressed code is determined to be a compressed code based on the longest matching data string obtained by cross-checking 2. When the subsequent bit is “1”, the compressed code is determined to be a compressed code based on the longest matching data string obtained by cross-checking 1.
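The identifier scheme described above behaves like a short prefix code, and the following minimal sketch shows how a reader might classify it. The function name `read_identifier`, the use of a string of "0" and "1" characters as the bit stream, and the returned labels are all assumptions made for illustration; the actual bit layout of the compressed data is not specified here.

```python
from typing import Tuple


def read_identifier(bits: str, offset: int = 0) -> Tuple[str, int]:
    """Classify the identifier starting at `offset` in a bit string.

    Returns (kind, next_offset). The labels are hypothetical:
      "huffman_or_raw" : identifier "0"  (no match of length >= Lmin)
      "match_A2"       : identifier "10" (cross-checking 2)
      "match_A3"       : identifier "11" (cross-checking 1)
    """
    if bits[offset] == "0":
        return "huffman_or_raw", offset + 1
    # The first bit is "1", so the subsequent bit selects the region.
    if bits[offset + 1] == "1":
        return "match_A3", offset + 2
    return "match_A2", offset + 2


kind0 = read_identifier("0101")
kind10 = read_identifier("10")
kind11 = read_identifier("011", 1)
```

Because no identifier is a prefix of another, the reader never needs to look more than two bits ahead to decide which storage region, if any, a compressed code refers to.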

The storage region A3 is updated based on the processing target data and data stored in the storage region A2. For example, the longest matching data string extracted by cross-checking 2 is stored in the storage region A3 (“updating 1” illustrated in FIG. 1).

The storage region A3 is, for example, a storage region with a fixed data size. The data size is, for example, from about several kilobytes to about several tens of kilobytes. For example, when storing new data would bring the stored data to the given data size or beyond, the new data overwrites the old data stored at the beginning of the storage region A3. Further, the logical beginning is updated sequentially as data is stored. The data stored in the storage region A3 is indicated by, for example, relative addresses from the position of the beginning, which slides as data is stored. The storage order of the data in the storage region A3 is likewise indicated by the relative addresses from the position of the logical beginning.
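One way to read the fixed-size region with a sliding logical beginning is as a ring buffer. The sketch below is only an interpretation under that assumption: the class name `StorageRegion`, its methods, and the choice that relative address 0 refers to the oldest stored byte are hypothetical and are not taken from the embodiment.

```python
class StorageRegion:
    """Fixed-size region in which new data overwrites the oldest data.

    Hypothetical sketch: `start` is the physical index of the logical
    beginning, and all reads use addresses relative to it.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.buf = bytearray(capacity)
        self.start = 0  # physical index of the logical beginning
        self.size = 0   # number of valid bytes currently stored

    def store(self, data: bytes) -> None:
        for byte in data:
            end = (self.start + self.size) % self.capacity
            self.buf[end] = byte
            if self.size < self.capacity:
                self.size += 1
            else:
                # Region is full: the oldest byte was overwritten,
                # so the logical beginning slides forward by one.
                self.start = (self.start + 1) % self.capacity

    def read(self, rel: int, length: int) -> bytes:
        """Read `length` bytes starting at relative address `rel`."""
        return bytes(
            self.buf[(self.start + rel + k) % self.capacity]
            for k in range(length)
        )

    def contents(self) -> bytes:
        return self.read(0, self.size)


region = StorageRegion(8)
region.store(b"abcdefgh")
region.store(b"XY")
```

After the second `store`, the two oldest bytes have been overwritten and the logical beginning has moved, while relative addresses still run from the oldest surviving byte to the newest.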

For example, updating 1 of the storage region A3 described above may be configured to be executed when a compressed code corresponding to the processing target data is not generated by cross-checking 1 because the length of the longest matching data string is shorter than the predetermined length Lmin.

The storage region A2 is updated based on the processing target data read sequentially from the file F1. For example, the processing target data in the storage region A1 is written to the storage region A2 (updating 2). The storage region A2 is, for example, a storage region with a fixed data size. The data size is, for example, from about several kilobytes to about several tens of kilobytes. For example, when storing new data in the storage region A2 would bring the stored data to the given data size or beyond, the new data overwrites the old data stored at the beginning of the storage region A2.

For example, updating 2 described above may be executed when the storage region A3 is not updated because the length of the longest matching data string obtained by cross-checking 2 is shorter than the predetermined length Lmin. Further, updating 2 may be executed when the matching length of one of the longest matching data string obtained by cross-checking 1 and the longest matching data string obtained by cross-checking 2 is shorter than the predetermined length Lmin.

In the example of the file F1 illustrated in FIG. 1, a character string“ . . . 1st horse . . . 2nd horse . . . 3rd horse . . . ” is included (“. . . ” is an unspecified character string).

When the portion starting from “h” of “1st horse . . . ” is the processing target data, the longest matching data string for “horse . . . ” is searched for in the storage region A3. As illustrated in FIG. 1, no data is stored in the storage region A3. Therefore, data matching “h”, which is the beginning data, is not present. That is, the length of the longest matching data string obtained by cross-checking 1 is shorter than the predetermined length Lmin. The longest matching data string for “horse . . . ” is then searched for in the storage region A2. Since data matching “h” is not included in the storage region A2 either, the length of the longest matching data string obtained by cross-checking 2 is shorter than the predetermined length Lmin.

In the example of FIG. 1, when the portion starting from “h” of “1st horse . . . ” is the processing target data, a longest matching data string with a matching length equal to or greater than the predetermined length Lmin is obtained in neither cross-checking 1 nor cross-checking 2. Therefore, the Huffman coding of the beginning data “h” of the processing target data is executed. As described above, another compression process may be used instead of the Huffman coding, or the character data “h” may be used without change. Compressed data d1 obtained based on the beginning data is written to the storage region A4.

Since the length of the longest matching data string obtained by cross-checking 2 is shorter than the predetermined length Lmin, the updating (updating 1) of the storage region A3 is not executed. Since a longest matching data string with a matching length equal to or greater than the predetermined length Lmin is obtained in neither cross-checking 1 nor cross-checking 2, the storage region A2 is updated (updating 2). In updating 2, the beginning data “h” of the processing target data is stored in the storage region A2.

The subsequent processing target data becomes “orse . . . ”. For “orse . . . ” as well, the Huffman coding of “o” and the storage of “o” in the storage region A2 are executed in the same way as for “horse . . . ”.

When the portion starting from “h” of “2nd horse . . . ” is the processing target data, the longest matching data string for “horse . . . ” is searched for in the storage region A3 (cross-checking 1). As illustrated in FIG. 1, since no data is stored in the storage region A3, data matching “h”, which is the beginning data, is not present. That is, the length of the longest matching data string obtained by cross-checking 1 is shorter than the predetermined length Lmin.

The longest matching data string for “horse . . . ” is then searched for in the storage region A2 (cross-checking 2). Here, “ . . . 1st horse . . . ” is already stored in the storage region A2. For example, “h” is first searched for in the storage region A2. Then, whether “o”, “r”, “s”, and “e” follow contiguously from the “h” found in the storage region A2 is checked sequentially. In the example of FIG. 1, the processing target data “horse . . . ” matches “horse” of “ . . . 1st horse . . . ” in the storage region A2. When the matching “horse” is the longest matching data string and the matching length is equal to or greater than the predetermined length Lmin, the compressed code is generated based on the address and the matching length of the longest matching data string “horse” in the storage region A2. The compressed data d2 including the generated compressed code is stored in the storage region A4.

Since the matching length of the longest matching data string obtained by cross-checking 2 is equal to or greater than the predetermined length Lmin, the updating of the storage region A3 is executed based on the longest matching data string (updating 1). That is, the longest matching data string “horse” is stored in the storage region A3.

When the portion starting from “h” of “3rd horse . . . ” is the processing target data, the longest matching data string for “horse . . . ” is searched for in the storage region A3 (cross-checking 1). As illustrated in FIG. 1, “horse” is already stored in the storage region A3. For example, “h” is first searched for in the storage region A3. Then, whether “o”, “r”, “s”, and “e” follow contiguously from the “h” found in the storage region A3 is checked sequentially. In the example of FIG. 1, the data string “horse” matches in cross-checking 1. When the matching “horse” is the longest matching data string and the matching length is equal to or greater than the predetermined length Lmin, the compressed code is generated based on the address and the matching length of the longest matching data string “horse” in the storage region A3. The compressed data d3 including the generated compressed code is stored in the storage region A4.

Since the matching length of the longest matching data string obtained by cross-checking 1 is equal to or greater than the predetermined length Lmin, the process proceeds to the subsequent data without executing, for example, cross-checking 2, updating 1, and updating 2. Alternatively, of cross-checking 2, updating 1, and updating 2, only updating 2 may be configured to be executed.

Through the compression process illustrated in FIG. 1, a data string appearing a plurality of times in the file F1 is stored in the storage region A3. That is, the compressed code is generated according to a result obtained by referring to the storage region A3, in which data strings appearing a plurality of times in the file F1 are stored. When the same data is frequently used in the file F1, there is a high possibility that the data stored in the storage region A3 is data frequently used in the file F1. That is, there is a high possibility that data matched by the slide window of normal LZ77 is also found by cross-checking against the data stored in the storage region A3. The longest matching data string in the storage region A2 is stored in the storage region A3. Therefore, when the data repeated in the file F1 is long, there is a high probability that a data string with a long data length is extracted as the longest matching data string in the storage region A3 as well. Thus, the characteristics of compression using the longest matching data string are maintained. Therefore, since the compression ratio is maintained even when the storage region A3 is made smaller than the slide window used in LZ77, and the amount of data cross-checked against the processing target data is small, the compression speed is expected to improve.
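The flow walked through for FIG. 1 can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the embodiment: the names `compress` and `longest_match` are hypothetical, the storage regions A2 and A3 are modeled as unbounded byte arrays (the fixed size and overwriting are omitted), unmatched bytes are emitted as raw literals instead of Huffman codes, and updating 2 is executed only when neither cross-check succeeds.

```python
from typing import List, Tuple

LMIN = 3  # threshold length; 3 bytes is the example value given for ZIP


def longest_match(region: bytes, target: bytes) -> Tuple[int, int]:
    """Brute-force search for the longest prefix of `target` in `region`."""
    best_pos, best_len = 0, 0
    for pos in range(len(region)):
        n = 0
        while (n < len(target) and pos + n < len(region)
               and region[pos + n] == target[n]):
            n += 1
        if n > best_len:
            best_pos, best_len = pos, n
    return best_pos, best_len


def compress(data: bytes, lmin: int = LMIN) -> List[tuple]:
    a2 = bytearray()  # stands in for storage region A2
    a3 = bytearray()  # stands in for storage region A3
    tokens: List[tuple] = []
    i = 0
    while i < len(data):
        target = data[i:]
        pos, n = longest_match(a3, target)      # cross-checking 1
        if n >= lmin:
            tokens.append(("11", pos, n))
            i += n
            continue
        pos, n = longest_match(a2, target)      # cross-checking 2
        if n >= lmin:
            tokens.append(("10", pos, n))
            a3 += a2[pos:pos + n]               # updating 1
            i += n
            continue
        tokens.append(("0", data[i:i + 1]))     # literal instead of Huffman
        a2 += data[i:i + 1]                     # updating 2
        i += 1
    return tokens


tokens = compress(b"1st horse 2nd horse 3rd horse")
```

With this input, the first occurrence of the repeated string is emitted as literals, the second is coded against the A2 stand-in and copied into the A3 stand-in, and the third is coded against the A3 stand-in. Because the sketch matches greedily, the matched string includes the surrounding spaces and is therefore slightly longer than the “horse” used in the description.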

The compression process is executed by reading the data in the file F1 once. Therefore, access to the memory in which each storage region is provided is reduced compared with the case in which a compression dictionary is generated first and a compression process is then executed based on the compression dictionary. Further, when the storage region A2 is updated only at times at which the storage region A3 is not updated, the number of times the storage region A2 is updated is reduced.

As a modification example of the example of FIG. 1, compressed data including a Huffman code may be generated even when a longest matching data string with a length equal to or greater than the predetermined length Lmin is present in the storage region A2. In the example of FIG. 1, when the portion starting from “h” of “ . . . 2nd horse . . . ” is the processing target data, a longest matching data string with a length equal to or greater than the predetermined length Lmin is not obtained by cross-checking 1 but is obtained by cross-checking 2, as described above. In this case, for example, compressed data with the identifier “0” including the Huffman code of “h” is generated without using the longest matching data string “horse” in the storage region A2. Even in this case, the updating process (updating 1) for the storage region A3 based on the longest matching data string “horse” is executed. Then, when subsequent processing target data is “horse . . . ”, a compressed code is generated by cross-checking against the storage region A3 (cross-checking 1). In this modification example, since the compressed code generated by cross-checking against the storage region A2 is not used, the identifier assigned to the compressed code generated by cross-checking against the storage region A3 may be set to “1”. This is because the storage region to be referred to for decoding can be determined without checking a second bit, as is needed with “10” and “11”. In this modification example, when the same data string appears for the second time, a compressed code based on the longest matching data string is not generated, and a compressed code based on the longest matching data string is generated from the third appearance onward. On the other hand, as described above, the identifier that determines the decompression method is expressed with 1 bit.

FIG. 2 is a diagram illustrating the flow of a decompression process. As work areas of the decompression process, a storage region B1, a storage region B2, and a storage region B3 are provided in a memory. The compressed data is read sequentially from the compressed file F2, which is the decompression target loaded to the storage region B1, and the read processing target compressed data is converted sequentially into decompressed data. As described above, the compressed data included in the compressed file F2 is either a compressed code generated by the compression process of the embodiment or a compressed code generated by a compression process other than that of the embodiment, such as the Huffman coding (or the processing target data itself). The obtained decompressed data is stored sequentially in a storage region B4, and a decompressed file F3 is generated based on the decompressed data stored in the storage region B4. In the decompression process illustrated in FIG. 2, whenever the processing target compressed data is read from the file F2, at least one of the following is executed: generation of the decompressed data, updating of the storage region B3 according to a reference result of the storage region B2, and updating of the storage region B2.

When the processing target compressed data is a compressed code generated by the compression process of the embodiment (for example, the identifier is “10” or “11”), the decompressed data is generated using the data stored in the storage region B2 or the storage region B3. When the compressed data is a compressed code generated by a compression process other than that of the embodiment, such as the Huffman coding (for example, the identifier is “0”), the decompressed data is generated according to that compression process. Whether the compressed data is a compressed code generated by the compression process of the embodiment is determined based on the identifier assigned to the compressed data.

For example, when the identifier of the processing target compressed data is “11”, data is acquired from the storage region B3 based on the position and the matching length of the longest matching data string included in the compressed code. The acquired data is the data that is present at the position indicated in the compressed code in the storage region B3 and has the matching length indicated in the compressed code. The acquired data is the decompressed data. When the identifier of the processing target compressed data is “10”, the data that is present at the position indicated in the compressed code in the storage region B2 and has the length indicated in the compressed code is acquired. The acquired data is the decompressed data. When the identifier of the processing target compressed data is “0”, the decompressed data is generated based on the compressed data. When the compressed data is a compressed code generated by a compression process other than that of the embodiment, the decompressed data is acquired by a decompression process corresponding to that compression process. When the compressed data is the original data itself rather than a compressed code, that data is the decompressed data.

The storage region B2 and the storage region B3 are updated as needed as each piece of compressed data is decompressed. When the identifier included in the compressed data is “0”, the updating process (updating 1) for the storage region B3 is not executed and the updating process (updating 2) for the storage region B2 is executed. When the identifier included in the compressed data is “10”, the updating process (updating 2) for the storage region B2 is not executed and the updating process (updating 1) for the storage region B3 is executed. When the compression process of FIG. 1 is set to execute updating 2 upon generation of a compressed code to which the identifier “10” is assigned, updating 2 is also executed in the decompression process upon decompression of the compressed data to which the identifier “10” is assigned. When the identifier included in the compressed data is “11”, neither the updating process (updating 2) for the storage region B2 nor the updating process (updating 1) for the storage region B3 is executed. When the compression process of FIG. 1 is set to execute updating 2 upon generation of a compressed code to which the identifier “11” is assigned, updating 2 is also executed in the decompression process upon decompression of the compressed data to which the identifier “11” is assigned.

The storage region B3 is, for example, a storage region with a fixed data size. The data size is, for example, from about several kilobytes to about several tens of kilobytes. For example, when storing new data in the storage region B3 would bring the stored data to the given data size or beyond, the new data overwrites the old data stored at the beginning of the storage region B3. The logical beginning is updated sequentially as data is stored. The data stored in the storage region B3 is indicated by, for example, relative addresses from the position of the sequentially updated beginning. The storage order of the data in the storage region B3 is likewise indicated by the relative addresses from the position of the logical beginning.

Like the storage region B3, the storage region B2 is also, for example, a storage region with a fixed data size. The data size is, for example, from about several kilobytes to about several tens of kilobytes. For example, when storing new data in the storage region B2 would bring the stored data to the given data size or beyond, the new data overwrites the old data stored at the beginning of the storage region B2. The logical beginning is updated sequentially as data is stored. The data stored in the storage region B2 is indicated by, for example, relative addresses from the position of the sequentially updated beginning. The storage order of the data in the storage region B2 is likewise indicated by the relative addresses from the position of the logical beginning.

FIG. 2 illustrates examples of the decompression process for the compressed data d1, the compressed data d2, and the compressed data d3 exemplified in FIG. 1. The compressed data included in the compressed file F2 is decompressed in order according to the identifiers included in the compressed data.

In reading of the compressed data d1, it is first confirmed that the identifier is “0”. When the compressed data d1 was generated using the Huffman coding, the decompressed data “h” is generated by the decoding procedure of the Huffman coding. When the compressed data d1 includes the data “h” itself, the data “h” is used as the decompressed data. When the identifier is “0”, the updating (updating 2 illustrated in FIG. 2) of the storage region B2 is executed. In updating 2, the decompressed data “h” is stored in the storage region B2.

In reading of the compressed data d2, it is first confirmed that the identifier is “10”. When the identifier is “10”, the storage region B2 is referred to based on the compressed code in the compressed data d2 (reference 2 illustrated in FIG. 2). In the example of FIG. 2, the data that starts at the position indicated in the compressed code in the storage region B2 and has the matching length indicated in the compressed code is “horse”. Since the storage region B2 has been updated (updating 2 illustrated in FIG. 2) based on the compressed data read earlier than the compressed data d2 and is therefore in the same state as the storage region A2 at the time of the generation of the compressed data d2 in the example of FIG. 1, the same data “horse” is acquired by the designation of the position and the length. When the identifier is “10”, the updating of the storage region B3 (updating 1 illustrated in FIG. 2) is executed. In updating 1, the decompressed data “horse” is stored in the storage region B3. When the identifier is “10”, whether updating 2 is executed depends on the setting. When the updating of the storage region A2 (updating 2 illustrated in FIG. 1) is executed at the time of the generation of the compressed data with the identifier “10” in the compression process, the updating of the storage region B2 (updating 2 illustrated in FIG. 2) is also executed in the decompression process.

In reading of the compressed data d3, it is first confirmed that the identifier is “11”. When the identifier is “11”, the storage region B3 is referred to based on the compressed code in the compressed data d3 (reference 1 illustrated in FIG. 2). In the example of FIG. 2, the data with the matching length indicated in the compressed code, starting from the position indicated in the compressed code in the storage region B3, is “horse”. Since the storage region B3 has been updated (updating 1 illustrated in FIG. 2) based on the compressed data read earlier than the compressed data d3, and is therefore in the same state as the storage region A3 at the time of the generation of the compressed data d3 in the example of FIG. 1, the same data “horse” is acquired by the designation of the position and the length. When the identifier is “11”, whether updating 1 and updating 2 are executed depends on the setting. When the updating of the storage region A2 (updating 2 illustrated in FIG. 1) is executed at the time of the generation of the compressed data of the identifier “11” in the compression process, updating is also executed in the decompression process according to the updating executed at the time of the generation of the compressed data. In the decompression process illustrated in FIG. 2, the updating of the storage region B2 is executed when the compressed data is the compressed code by a compression process other than the embodiment (or the compressed data is processing target data). Therefore, unlike LZ77, the slide window is not updated for each decompression of the compressed data, and by suppressing the number of updates in this way, an increase in decompression speed may be achieved.
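The identifier dispatch described for d1, d2, and d3 can be sketched roughly as follows. This is an illustrative sketch only, under several assumptions that are not in the description: the compressed data is already parsed into (identifier, payload) tuples, literals are carried as raw bytes instead of Huffman codes, the storage regions are unbounded Python bytearrays without wraparound, and the function name `decompress_tokens` and the flag `update_b2_on_match` are hypothetical.

```python
def decompress_tokens(tokens, update_b2_on_match=True):
    """Decode (identifier, payload) tokens into bytes.

    payload is a bytes literal for identifier "0" and a (position, length)
    pair for identifiers "10" and "11". All names here are hypothetical.
    """
    b2 = bytearray()   # stands in for storage region B2
    b3 = bytearray()   # stands in for storage region B3
    out = bytearray()  # stands in for the decompressed output
    for ident, payload in tokens:
        if ident == "0":
            data = bytes(payload)
            b2 += data                            # updating 2
        elif ident == "10":
            pos, length = payload
            data = bytes(b2[pos:pos + length])    # reference 2
            b3 += data                            # updating 1
            if update_b2_on_match:
                b2 += data                        # updating 2, by setting
        elif ident == "11":
            pos, length = payload
            data = bytes(b3[pos:pos + length])    # reference 1
            if update_b2_on_match:
                b2 += data                        # updating 2, by setting
        else:
            raise ValueError(f"unknown identifier: {ident!r}")
        out += data
    return bytes(out)
```

The flag mirrors the statement that updating 2 for the identifiers “10” and “11” follows the setting used at compression time; the decoder reproduces the compressor's regions only if both sides use the same setting.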

In the modification example of the compression process described in FIG. 1, the two identifiers “0” and “1” are used and the compressed code indicating the data in the storage region A2 is not used. In a decompression process corresponding to the modification example, decompressed data is generated from the compressed data of the identifier “0”, the decompressed data is stored in the storage region B2 (updating 2), and the decompressed data and the data in the storage region B2 are cross-checked. When, as the result of the cross-checking, the longest matching data string matching the decompressed data with a length equal to or greater than the predetermined length Lmin is present in the storage region B2, the longest matching data string is stored in the storage region B3 (updating 1). Thus, the storage region B3 enters the same state as the storage region A3.

FIG. 3 is a diagram illustrating an example of a functional configuration. A computer 1 executing the process of the embodiment includes a storage unit 13 and at least one of a compression unit 11 and a decompression unit 12. The compression unit 11 executes a compression process and the decompression unit 12 executes a decompression process. The storage unit 13 stores, for example, a compression target file F1, a file F2 obtained through the compression process, and a file F3 obtained by decompressing the file F2. The storage unit 13 is used as a work area of the compression unit 11 or the decompression unit 12. The compression unit 11 includes a control unit 111, a reference unit 112, an updating unit 113, a reference unit 114, and an updating unit 115. The decompression unit 12 includes a control unit 121, a reference unit 122, an updating unit 123, a reference unit 124, and an updating unit 125.

The control unit 111 realizes a compression function by controlling the reference unit 112, the updating unit 113, the reference unit 114, and the updating unit 115. To retain data used for a process of each functional unit, the control unit 111 reserves storage regions (for example, the storage region A1, the storage region A2, the storage region A3, and the storage region A4 described above) in the storage unit 13. The reference unit 112 executes a reference process for data in the storage region A2 based on a data string at a read position in the storage region A1. The updating unit 113 updates the data in the storage region A2 according to reading of the data string at the read position in the storage region A1. The reference unit 114 executes a reference process for the storage region A3 illustrated in FIG. 1 based on the data string at the read position in the storage region A1. The control unit 111 generates compressed data according to a reference result in the storage region A3 by the reference unit 114. The updating unit 115 updates the storage region A3 according to a reference result of the storage region A2 by the reference unit 112. An execution order of the process by each functional unit in the compression unit 11 will be described below.

The control unit 121 realizes a decompression function by controlling the reference unit 122, the updating unit 123, the reference unit 124, and the updating unit 125. To retain data used for a process of each functional unit, the control unit 121 reserves storage regions (for example, the storage region B1, the storage region B2, the storage region B3, and the storage region B4 described above) in the storage unit 13. The reference unit 122 executes a reference process for data in the storage region B2 based on the compressed data at a read position in the storage region B1. The updating unit 123 updates the data in the storage region B2 according to the compressed data at the read position in the storage region B1. The reference unit 124 executes a reference process for the storage region B3 illustrated in FIG. 2 based on a data string at the read position in the storage region B1. The updating unit 125 updates the storage region B3 according to a reference result of the storage region B2 by the reference unit 122. An execution order of the process by each functional unit in the decompression unit 12 will be described below.

FIG. 4 is a diagram illustrating an example of a position information table T1 used to manage position information of the storage regions. The position information table T1 is used to manage the position of each storage region (the storage region A1, the storage region A2, the storage region A3, the storage region A4, and the like) in the storage unit 13 used for the compression process. The position information table T1 includes a start position P1, an end position P2, and a read position P3 of the storage region A1 to which the file F1 is loaded. The position information table T1 includes a start position P4, an end position P5, a reference position P6, and an update position P7 of the storage region A2. The position information table T1 includes a start position P8, an end position P9, a reference position P10, and an update position P11 of the storage region A3. The position information table T1 includes a start position P12, an end position P13, and a write position P14 of the storage region A4 provided for generating the file F2. An initial value of each piece of position information stored in the position information table T1 is set by the control unit 111. The start position and the end position of each storage region indicate a storage start position and end position of data to be compressed or decompressed. Since a header or the like is excluded, for example, the initial value of the read position P3 is the same as the start position P1. The initial values of the reference position P6 and the update position P7 are also the same as the start position P4. The initial values of the reference position P10 and the update position P11 are also the same as the start position P8, and the initial value of the write position P14 is also the same as the start position P12.
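As an illustration of the initial values listed for the position information table T1, the following sketch builds the table as a plain dictionary. The function name `make_position_table` and the representation of each storage region as a (start, end) pair are assumptions made for this example and are not part of the description.

```python
def make_position_table(a1, a2, a3, a4):
    """Build an illustrative table T1 from (start, end) pairs of A1 to A4.

    Each movable position is initialised to the start position of its
    region, as described for P3, P6, P7, P10, P11, and P14.
    """
    return {
        # storage region A1: start, end, read
        "P1": a1[0], "P2": a1[1], "P3": a1[0],
        # storage region A2: start, end, reference, update
        "P4": a2[0], "P5": a2[1], "P6": a2[0], "P7": a2[0],
        # storage region A3: start, end, reference, update
        "P8": a3[0], "P9": a3[1], "P10": a3[0], "P11": a3[0],
        # storage region A4: start, end, write
        "P12": a4[0], "P13": a4[1], "P14": a4[0],
    }
```

A table T2 for the decompression side could be built the same way with Q1 to Q14 in place of P1 to P14.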

FIG. 5 is a flowchart illustrating an example procedure of the compression process. When the compression function is called through an operation of an operating system or an application program in the computer 1 (S101), the control unit 111 executes preprocessing (S102). The preprocessing of S102 is, for example, reservation of the storage region A1, the storage region A2, and the storage region A3 illustrated in FIG. 1, reservation of the storage region A4 in which compressed data is stored, and setting of each piece of position information (for example, each piece of position information illustrated in FIG. 4) in each storage region.

When the process of S102 ends, the control unit 111 loads the compression target file F1 to the storage region A1 (S103). The control unit 111 sets the end position P2 based on termination of the file F1. Next, the control unit 111 causes the reference unit 114 to execute the reference process for the storage region A3 (S104).

FIG. 6 is an exemplary flowchart illustrating a reference process for the storage region A3. When an instruction for the reference process is received from the control unit 111 (S200), the reference unit 114 sets the reference position P10, the matching length L3a, and a longest matching position P3a (S201). The reference position P10 and the longest matching position P3a are set to be the same as the start position P8 or the same as the update position P11. The matching length L3a is set to, for example, “0”. The reference unit 114 further sets a counter value j to an initial value (for example, j=0) (S202).

The reference unit 114 determines whether unit data in the storage region A1 matches unit data in the storage region A3 (S203). In this determination, it is determined whether the unit data at a position (P3+j) moved by the counter value j from the read position P3 of the storage region A1 matches the unit data at a position (P10+j) moved by the counter value j from the reference position P10 of the storage region A3. The unit data is, for example, information in units of characters or the like. In the embodiment, the unit data corresponds to the unit amount in which the counter value j, the reference position P10, and the like are expressed; that is, its data amount equals one increment of the counter value j or of the reference position P10.

When the unit data at the position (P3+j) matches the unit data at the position (P10+j) (Yes in S203), the reference unit 114 increases the counter value j (S204). The reference unit 114 further determines whether the counter value j is equal to or greater than a predetermined length Lmax (S205). The predetermined length Lmax is the maximum value set for the matching length L3a handled in the compression process of the embodiment. In the embodiment, since the increment of the counter value j is a unit amount (a data amount of the unit data) for length expression of the predetermined length Lmax, it may be determined in the process of S205 whether the counter value j is “the same” as the predetermined length Lmax. When the matching determination of S203 is executed for a plurality of pieces of unit data at a time rather than for each piece of unit data, and the counter value j is increased in S204 by an amount according to that determination, the counter value j may become greater than the predetermined length Lmax. In this case, it may be determined in S205 whether the counter value j is “equal to or greater than” the predetermined length Lmax. The same applies to a comparison process with the predetermined length Lmax to be described below. That is, when the data amount of the increment is the unit amount of the length expression, whether the counter value j is “the same as” the predetermined length Lmax is determined, and when the data amount of the increment is greater than the unit amount of the length expression, whether the counter value j is “equal to or greater than” the predetermined length Lmax is determined. When the counter value j is “equal to or greater than” the predetermined length Lmax (Yes in S205), the reference unit 114 substitutes the counter value j to the matching length L3a and substitutes the value of the reference position P10 to the longest matching position P3a (S206). Here, “=” of “L3a=j and P3a=P10” expressed in the process of S206 in FIG. 6 indicates a substitution operator. When the counter value j is not equal to or greater than the predetermined length Lmax (No in S205), the reference unit 114 executes the determination of S203 again. S205 and S206 are additional (optional) steps; the determination of S203 may instead always be executed after S204 is executed.

When the unit data at the position (P3+j) does not match the unit data at the position (P10+j) (No in S203), the reference unit 114 determines whether the counter value j is greater than the matching length L3a (S207). When the counter value j is greater than the matching length L3a (Yes in S207), the reference unit 114 substitutes the counter value j to the matching length L3a and substitutes the value of the reference position P10 to the longest matching position P3a (S208). When the counter value j is equal to or less than the matching length L3a (No in S207), or after the process of S208 is executed, the reference unit 114 increases the reference position P10 (S209).

The reference unit 114 determines whether the reference position P10 increased in S209 reaches the end position P9 (S210). Here, “=” of “P10=P9” expressed in the process of S210 in FIG. 6 indicates an equal sign. When the reference position P10 does not reach the end position P9 (No in S210), the reference unit 114 executes the process of S202 again. When the reference position P10 is set to the start position P8 in S201, the above-described process is executed. However, when the reference position P10 is set to the update position P11, it is determined in S210 whether the reference position P10 reaches the update position P11. When the reference position P10 is initially set to the update position P11 and the reference position P10 reaches the end position P9, the reference position P10 returns to the start position P8 through the process of S209.

When the reference position P10 reaches the end position P9 (Yes in S210), or after the process of S206 is executed, the reference unit 114 returns the longest matching position P3a and the matching length L3a to the control unit 111 and ends the reference process for the storage region A3 (S211).
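The reference process of FIG. 6 can be approximated by the following sketch. It is a simplified illustration and not the flowchart itself: the function name `longest_match` is hypothetical, the search always starts from the beginning of the region (the start position P8 case), and the bounds checks on the target and the region are assumptions added so that the sketch runs on arbitrary inputs.

```python
def longest_match(target, region, l_max):
    """Return (longest matching position, matching length) for target in region.

    target stands in for the data from the read position P3 in A1,
    region for the storage region A3. Step labels are approximate.
    """
    best_pos, best_len = 0, 0                  # P3a and L3a (S201)
    for ref in range(len(region)):             # reference position P10 (S209, S210)
        j = 0                                  # counter value j (S202)
        while (j < len(target) and ref + j < len(region)
               and target[j] == region[ref + j]):   # S203
            j += 1                             # S204
            if j >= l_max:                     # S205
                return ref, j                  # S206, then S211
        if j > best_len:                       # S207
            best_pos, best_len = ref, j        # S208
    return best_pos, best_len                  # S211
```

Because S207 uses a strict comparison, the first of several equally long matches is the one reported.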

When the reference result by the reference unit 114 returns to the control unit 111, the control unit 111 causes the reference unit 112 to execute the reference process for the storage region A2 (S105).

FIG. 7 is an exemplary flowchart illustrating the reference process for the storage region A2. When an instruction to refer to the storage region A2 is given by the control unit 111 (S300), the reference unit 112 determines whether the matching length L3a obtained through the process of the reference unit 114 is shorter than the predetermined length Lmin (S301). When the matching length L3a is equal to or greater than the predetermined length Lmin (No in S301), the reference unit 112 ends the process (S314).

When the matching length L3a is shorter than the predetermined length Lmin (Yes in S301), the reference unit 112 sets the reference position P6, the matching length L2a, and the longest matching position P2a (S302). The reference position P6 and the longest matching position P2a are set to be the same as the start position P4 or the same as the update position P7. The matching length L2a is set to, for example, “0”. The reference unit 112 further sets a counter value i to an initial value (for example, i=0) (S303).

The reference unit 112 determines whether the unit data in the storage region A1 matches the unit data in the storage region A2 (S304). In this determination, it is determined whether the unit data at a position (P3+i) moved by the counter value i from the read position P3 of the storage region A1 matches the unit data at a position (P6+i) moved by the counter value i from the reference position P6 of the storage region A2.

When the unit data at the position (P3+i) matches the unit data at the position (P6+i) (Yes in S304), the reference unit 112 increases the counter value i (S305). The reference unit 112 further determines whether the counter value i is equal to or greater than a predetermined length Lmax (S306). The predetermined length Lmax may be the same as or different from the predetermined length Lmax in FIG. 6. When the counter value i is equal to or greater than the predetermined length Lmax (Yes in S306), the reference unit 112 substitutes the counter value i to the matching length L2a (S307) and substitutes the value of the reference position P6 to the longest matching position P2a (S308). Here, “=” expressed in the processes of S307 and S308 in FIG. 7 indicates a substitution operator. When the counter value i is not equal to or greater than the predetermined length Lmax (No in S306), the reference unit 112 executes the determination of S304 again. S306 to S308 are additional (optional) steps; the determination of S304 may instead always be executed after S305 is executed.

When the unit data at the position (P3+i) does not match the unit data at the position (P6+i) (No in S304), the reference unit 112 determines whether the counter value i is greater than the matching length L2a (S309). When the counter value i is greater than the matching length L2a (Yes in S309), the reference unit 112 substitutes the counter value i to the matching length L2a (S310) and substitutes the value of the reference position P6 to the longest matching position P2a (S311). When the counter value i is equal to or less than the matching length L2a (No in S309), or after the process of S311 is executed, the reference unit 112 increases the reference position P6 (S312).

The reference unit 112 determines whether the reference position P6 increased in S312 is the end position P5 (S313). When the reference position P6 is set to the start position P4 in S302, the above-described process is executed. However, when the reference position P6 is set to the update position P7, it is determined in S313 whether the reference position P6 is the update position P7. When the reference position P6 is initially set to the update position P7 and the reference position P6 is the end position P5, the reference position P6 returns to the start position P4 through the process of S312. Here, “=” of “P6=P5” expressed in the process of S313 in FIG. 7 indicates an equal sign. When the reference position P6 is not the end position P5 (No in S313), the reference unit 112 executes the process of S303 again.

When the reference position P6 is the end position P5 (Yes in S313), or after the process of S308 is executed, the reference unit 112 returns the longest matching position P2a and the matching length L2a to the control unit 111 and ends the reference process for the storage region A2 (S314).
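What distinguishes the reference process of FIG. 7 is the gate in S301: the storage region A2 is searched only when the search of the storage region A3 did not yield a match of at least Lmin. A compact sketch of that gate follows. The names `common_prefix_len` and `reference_a2` are hypothetical, returning `None` for the skipped case is a choice made for this example, and the search starts from the beginning of the region.

```python
def common_prefix_len(a, b, limit):
    """Length of the common prefix of a and b, capped at limit."""
    n = 0
    while n < limit and n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def reference_a2(target, a2, l3a, l_min, l_max):
    """Search A2 only when the A3 match L3a is shorter than Lmin (S301)."""
    if l3a >= l_min:                       # No in S301: skip the search (S314)
        return None
    best_pos, best_len = 0, 0              # P2a and L2a (S302)
    for ref in range(len(a2)):             # reference position P6
        i = common_prefix_len(target, a2[ref:], l_max)
        if i > best_len:                   # S309
            best_pos, best_len = ref, i    # S310, S311
        if best_len >= l_max:              # S306: cannot improve further
            break
    return best_pos, best_len
```

Skipping the search of A2 whenever A3 already supplies a sufficiently long match is what keeps the more frequently updated region out of the common path.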

Next, the control unit 111 causes the updating unit 115 to execute the updating process to the storage region A3 based on the result (the matching length L2a and the longest matching position P2a) of the reference process to the storage region A2 of S105 (S106).

FIG. 8 is an exemplary flowchart illustrating the updating process for the storage region A3. When an instruction of the updating process of the storage region A3 is received from the control unit 111 (S400), the updating unit 115 determines whether the matching length L3a obtained through the process of the reference unit 114 is shorter than the predetermined length Lmin (S401). When the matching length L3a is shorter than the predetermined length Lmin (Yes in S401), the updating unit 115 determines whether the matching length L2a obtained through the process of the reference unit 112 is equal to or greater than the predetermined length Lmin (S402). When the matching length L3a is equal to or greater than the predetermined length Lmin (No in S401), or when the matching length L2a is shorter than the predetermined length Lmin (No in S402), the updating unit 115 ends the process (S406).

When the matching length L2a is equal to or greater than the predetermined length Lmin (Yes in S402), the updating unit 115 reads data with the matching length L2a from the longest matching position P2a in the storage region A2 (S403). The updating unit 115 stores the data read in S403 at the update position P11 of the storage region A3 (S404). Then, the updating unit 115 updates the update position P11 according to the storage of the data in S404 (S405). In S405, the update position P11 moves by the matching length L2a. Here, “=” of “P11=P11+L2a” expressed in the process of S405 in FIG. 8 indicates a substitution operator. When the process of S405 ends, the updating unit 115 ends the process (S406).
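The updating process of FIG. 8 amounts to a guarded copy from the storage region A2 to the storage region A3. A minimal sketch, assuming the regions are Python bytearrays addressed by zero-based offsets and that no wraparound at the end position P9 is needed (the function name `update_a3` is hypothetical):

```python
def update_a3(a2, a3, p11, p2a, l2a, l3a, l_min):
    """Copy the A2 match into A3 when appropriate; return the new P11.

    a3 is modified in place. Wraparound at the end of A3 is not handled.
    """
    if l3a >= l_min:               # No in S401: A3 already had a usable match
        return p11                 # S406
    if l2a < l_min:                # No in S402: the A2 match is too short
        return p11                 # S406
    data = a2[p2a:p2a + l2a]       # S403
    a3[p11:p11 + l2a] = data       # S404
    return p11 + l2a               # S405: P11 = P11 + L2a
```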

When the process of the updating unit 115 ends, the control unit 111 causes the updating unit 113 to execute the updating process to the storage region A2 (S107).

FIG. 9 is an exemplary flowchart illustrating an updating process for the storage region A2. When an instruction of the updating process for the storage region A2 is received from the control unit 111 (S500), the updating unit 113 determines whether the matching length L3a obtained through the process of the reference unit 114 is equal to or greater than the predetermined length Lmin (S501).

When the matching length L3a is equal to or greater than the predetermined length Lmin (Yes in S501), the updating unit 113 reads the data with the matching length L3a from the read position P3 of the storage region A1 and stores the read data at the update position P7 of the storage region A2 (S502). The updating unit 113 updates the update position P7 according to the storage of the data in S502 (S503). In S503, the update position P7 moves by the matching length L3a. Here, “=” of “P7=P7+L3a” in the process of S503 in FIG. 9 indicates a substitution operator.

When the matching length L3a is not equal to or greater than the predetermined length Lmin (No in S501), it is further determined whether the matching length L2a is equal to or greater than the predetermined length Lmin (S504). When it is determined in the determination of S504 that the matching length L2a is equal to or greater than the predetermined length Lmin (Yes in S504), the updating unit 113 reads the data with the matching length L2a from the read position P3 of the storage region A1 and stores the read data at the update position P7 of the storage region A2 (S505). The updating unit 113 updates the update position P7 according to the storage of the data in S505 (S506). In S506, the update position P7 moves by the matching length L2a. Here, “=” of “P7=P7+L2a” in the process of S506 in FIG. 9 indicates a substitution operator.

Conversely, when the matching length L2a is shorter than the predetermined length Lmin (No in S504), the updating unit 113 reads the unit data from the read position P3 of the storage region A1 and stores the read data at the update position P7 of the storage region A2 (S507). The updating unit 113 updates the update position P7 (S508). In S508, the update position P7 moves by the length of the unit data. The unit data is, for example, information in units of characters or the like. In the embodiment, the data amount of the unit data equals one increment of the counter value j, the reference position P10, or the like. Here, “=” of “P7=P7+(unit length)” in the process of S508 in FIG. 9 indicates a substitution operator.
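The three branches of FIG. 9 differ only in how much data is copied from the storage region A1 to the storage region A2. A minimal sketch under the same assumptions as before (bytearray regions, zero-based offsets, no wraparound at the end position P5; the name `update_a2` and the default `unit_len=1` are hypothetical):

```python
def update_a2(a1, a2, p3, p7, l3a, l2a, l_min, unit_len=1):
    """Copy processed data from A1 into A2; return the new P7.

    a2 is modified in place. Wraparound at the end of A2 is not handled.
    """
    if l3a >= l_min:               # Yes in S501
        n = l3a                    # S502, S503
    elif l2a >= l_min:             # Yes in S504
        n = l2a                    # S505, S506
    else:                          # No in S504
        n = unit_len               # S507, S508
    a2[p7:p7 + n] = a1[p3:p3 + n]  # store at the update position P7
    return p7 + n                  # P7 = P7 + n
```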

When the process of S503, S506, or S508 is executed, the updating unit 113 ends the process (S509). When the process of the updating unit 113 ends, the control unit 111 generates the compressed data and writes the generated compressed data on the write position P14 of the storage region A4 (S108).

FIG. 10 is an exemplary flowchart illustrating generation and writing of the compressed data. When the control unit 111 starts generation and writing processes (S600), the control unit 111 determines whether the matching length L3a obtained through the process of the reference unit 114 is equal to or greater than the predetermined length Lmin (S601).

When the matching length L3a is equal to or greater than the predetermined length Lmin (Yes in S601), the control unit 111 generates a compressed code based on the longest matching position P3a and the matching length L3a, assigns the identifier “11” to the generated compressed code, and generates compressed data (S602). The control unit 111 writes the compressed data generated in S602 on the write position P14 (S603). Further, the control unit 111 updates the write position P14 (S604). In S604, the write position P14 moves by bits of the matching length L3a+2. The matching length L3a is a write size of the compressed code and 2 bits are a write size of the identifier assigned to the compressed code. Here, “=” of “P14=P14+L3a+2” in the process of S604 in FIG. 10 indicates a substitution operator.

When the matching length L3a is not equal to or greater than the predetermined length Lmin (No in S601), it is determined whether the matching length L2a is equal to or greater than the predetermined length Lmin (S605). When it is determined in the determination of S605 that the matching length L2a is equal to or greater than the predetermined length Lmin (Yes in S605), the control unit 111 generates the compressed code based on the longest matching position P2a and the matching length L2a, assigns the identifier “10” to the generated compressed code, and generates the compressed data (S606). The control unit 111 writes the compressed data generated in S606 on the write position P14 (S607). Further, the control unit 111 updates the write position P14 (S608). In S608, the write position P14 moves by bits of the matching length L2a+2. The matching length L2a is a write size of the compressed code and 2 bits are a write size of the identifier assigned to the compressed code. Here, “=” of “P14=P14+L2a+2” in the process of S608 in FIG. 10 indicates a substitution operator.

Conversely, when the matching length L2a is shorter than the predetermined length Lmin (No in S605), the control unit 111 executes the Huffman coding on the unit data at the read position P3, assigns the identifier “0” to the Huffman code, and generates compressed data (S609). As described above, the compression process other than that of the embodiment is not limited to the Huffman coding; when another such compression process is set to be executed, the control unit 111 executes the compression process according to the setting, assigns the identifier “0” to the data obtained through the compression process, and generates the compressed data. When the compression process is set not to be executed, the control unit 111 assigns the identifier “0” to the unit data and generates the compressed data. The control unit 111 writes the compressed data generated in S609 on the write position P14 (S610). Further, the control unit 111 updates the write position P14 (S611). In S611, the write position P14 moves by the length in bits of the generated Huffman code plus 1. Here, 1 bit is a write size of the identifier assigned to the compressed code. Here, “=” of “P14=P14+(unit length)+1” in the process of S611 in FIG. 10 indicates a substitution operator.
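The branch structure of FIG. 10 can be illustrated with a function that returns the compressed data as a string of bits. This is a sketch under stated assumptions rather than the described implementation: the literal branch emits the raw 8-bit unit data instead of a Huffman code (the case in which no other compression process is executed), the field widths of 12 and 4 bits are borrowed from the example of FIG. 11, and the function name `make_compressed_data` is hypothetical.

```python
def make_compressed_data(unit, p3a, l3a, p2a, l2a, l_min,
                         pos_bits=12, len_bits=4):
    """Return the compressed data for one step as a string of '0'/'1'.

    unit is the unit data at the read position P3 as an integer 0..255.
    """
    if l3a >= l_min:                                   # Yes in S601
        return ("11" + format(p3a, f"0{pos_bits}b")
                + format(l3a, f"0{len_bits}b"))        # S602
    if l2a >= l_min:                                   # Yes in S605
        return ("10" + format(p2a, f"0{pos_bits}b")
                + format(l2a, f"0{len_bits}b"))        # S606
    # No in S605: identifier "0" followed by the uncompressed unit data
    # (assumption: stands in for the Huffman code of S609).
    return "0" + format(unit, "08b")
```

The length of the returned string is the amount by which the write position P14 would advance in this simplified bit-level model.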

When the process of S604, S608, or S611 is executed, the control unit 111 ends the process (S612). An example of the compressed data generated by the procedure of FIG. 10 will be described with reference to FIG. 11. Then, the control unit 111 updates the read position P3 of the storage region A1 (S109). In S109, the read position P3 moves by an amount corresponding to the matching length L3a. When the matching length L3a is shorter than the predetermined length Lmin, the read position P3 moves by the unit length. When the matching length L3a is equal to or greater than the predetermined length Lmin, the read position P3 moves by the matching length L3a. The control unit 111 determines whether the updated read position P3 reaches the end position P2 of the file F1 (S110). When the read position P3 does not reach the end position P2 of the file F1 (No in S110), the control unit 111 executes the process of S104 again.

When the read position P3 reaches the end position P2 of the file F1 (Yes in S110), the control unit 111 generates the compressed file F2 based on the compressed data stored in the storage region A4 (S111). Identification information indicating that the compression process of the embodiment is executed is included in the header or the like of the compressed file F2. When the process of S111 ends, the control unit 111 ends the compression function (S112). For example, in S112, the control unit 111 notifies a caller of the compression function that the compression process ends.

Through the above-described compression process, a compression process is executed using the storage regions A3 and A2 as work areas. When data with a length equal to or greater than the predetermined length Lmin is repeated many times in the file F1, the data is stored in the storage region A3, and thus the compressed code is generated by searching the storage region A3. In this case, searching the storage region A2 is suppressed. When the compressed code is not generated by searching the storage region A3, searching the storage region A2 is executed and the searching result is reflected in the storage region A3. Therefore, for data that recurs, the compressed code is efficiently generated through only the searching of the storage region A3.
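Putting S104 to S110 together, the overall loop can be sketched at the token level as follows. The sketch makes several assumptions that go beyond the description: the storage regions are unbounded bytearrays, literals are kept as raw bytes rather than Huffman codes, the storage region A2 is updated in all three cases as in FIG. 9, and the read position advances by the matching length L2a when the identifier “10” is emitted (the description of S109 only mentions L3a and the unit length). The names `find_longest` and `compress` are hypothetical.

```python
def find_longest(target, region, l_max):
    """Return (position, length) of the longest prefix of target in region."""
    best_pos, best_len = 0, 0
    for ref in range(len(region)):
        n = 0
        while (n < l_max and n < len(target) and ref + n < len(region)
               and target[n] == region[ref + n]):
            n += 1
        if n > best_len:
            best_pos, best_len = ref, n
    return best_pos, best_len


def compress(data, l_min=3, l_max=15):
    """Return a list of (identifier, payload) tokens for data (bytes)."""
    a2, a3 = bytearray(), bytearray()       # storage regions A2 and A3
    tokens = []
    p3 = 0                                  # read position P3
    while p3 < len(data):                   # S110
        target = data[p3:]
        p3a, l3a = find_longest(target, a3, l_max)        # S104
        p2a, l2a = 0, 0
        if l3a < l_min:                                   # S301
            p2a, l2a = find_longest(target, a2, l_max)    # S105
        if l3a >= l_min:
            tokens.append(("11", (p3a, l3a)))             # S602
            step = l3a
        elif l2a >= l_min:
            a3 += a2[p2a:p2a + l2a]                       # S106 (updating 1)
            tokens.append(("10", (p2a, l2a)))             # S606
            step = l2a                                    # assumption, see above
        else:
            tokens.append(("0", data[p3:p3 + 1]))         # S609, raw literal
            step = 1
        a2 += data[p3:p3 + step]                          # S107 (updating 2)
        p3 += step                                        # S109
    return tokens
```

For the input `b"horsehorsehorse"`, the first “horse” is emitted as five literals, the second is found in A2 and copied to A3, and the third is found directly in A3 without searching A2.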

FIG. 11 is a diagram illustrating an example of the format of the compressed data. In the example of FIG. 11, the data “horse . . . ” is stored from the read position P3 in the storage region A1. In this case, (1) of FIG. 11 illustrates an example of the compressed data when both of the matching length L3a and the matching length L2a are shorter than the predetermined length Lmin. Further, (2) of FIG. 11 illustrates an example of the compressed data when the matching length L3a is shorter than the predetermined length Lmin and the matching length L2a is equal to or greater than the predetermined length Lmin. Furthermore, (3) of FIG. 11 illustrates an example of the compressed data when the matching length L3a is equal to or greater than the predetermined length Lmin.

In (1) of the example of the compressed data, the identifier “0” and the Huffman code are included. The identifier “0” indicates that the compressed code is not the compressed code by the compression process of the embodiment. The Huffman code included in (1) of the example of the compressed data is the Huffman code corresponding to the character “h” which is the unit data present at the read position P3. The compressed code may not be the Huffman code, but may be a compressed code compressed through another compression process different from the compression process of the embodiment.

In (2) of the example of the compressed code, the identifier, the longest matching position P2a, and the matching length L2a are included. The beginning bit “1” of the identifier “10” indicates the compressed code using the compression process of the embodiment. The longest matching position P2a indicates the position, in the storage region A2, of the data that has the longest match with the processing target data. In the example of FIG. 11, the longest matching position P2a is expressed with 12 bits and is 0x5E6. In this example, 12 bits are used to express the position of data in a data string configured by 1-byte characters. Since values 0 to 4095 may be expressed in 12-bit data, the longest matching position P2a may express a position in a data string of about 4 KB. Accordingly, the size of the storage region A2 for which the longest matching position P2a may express a position with 12 bits is about 4 KB or less. In other words, the number of bits of the longest matching position P2a is decided according to the setting of the size of the storage region A2. For example, when the storage region A2 is 4 KB, the longest matching position P2a is expressed with 12 bits. The matching length L2a indicates 0x5. Since the matching length L2a is 0x5, a longest matching data string in the example of FIG. 11 is the 5 characters of “horse”. Since 4 bits are used to express the matching length L2a, the matching length L2a may express up to 0xF (15 in decimal) in the example of FIG. 11. The number of bits expressing the matching length L2a is decided according to the setting of the predetermined length Lmax.

In (3) of the example of the compressed code, the identifier, the longest matching position P3a, and the matching length L3a are included. The beginning bit “1” of the identifier “11” indicates the compressed code using the compression process of the embodiment. The longest matching position P3a indicates the position, in the storage region A3, of the data that has the longest match with the processing target data. In the example of FIG. 11, the longest matching position P3a is expressed with 12 bits and is 0x5E6. In this example, 12 bits are used to express the position of data in a data string configured by 1-byte characters. Therefore, the size of the storage region A3 is about 4 KB or less. The matching length L3a indicates 0x5. Since the matching length L3a is 0x5, a longest matching data string in the example of FIG. 11 is the 5 characters of “horse”. Since 4 bits are used to express the matching length L3a, the matching length L3a may express up to 0xF (15 in decimal) in the example of FIG. 11, and this value is the upper limit of the predetermined length Lmax.
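The field layout of (2) and (3) of FIG. 11 can be checked with a small parser that splits a bit string into the identifier, the 12-bit longest matching position, and the 4-bit matching length. The function name `parse_match_code` and the representation of the compressed data as a string of '0' and '1' characters are assumptions made for this illustration.

```python
def parse_match_code(bits, pos_bits=12, len_bits=4):
    """Split a match code into (identifier, position, length).

    bits is a string of '0'/'1' starting with the 2-bit identifier.
    """
    ident = bits[:2]
    if ident not in ("10", "11"):
        raise ValueError("not a compressed code of the embodiment")
    pos_end = 2 + pos_bits
    position = int(bits[2:pos_end], 2)                 # P2a or P3a
    length = int(bits[pos_end:pos_end + len_bits], 2)  # L2a or L3a
    return ident, position, length


# Largest values expressible with the field widths of FIG. 11.
MAX_POSITION = (1 << 12) - 1   # 4095
MAX_LENGTH = (1 << 4) - 1      # 15 (0xF)
```

With these widths, the values 0x5E6 and 0x5 of the example occupy 16 bits, so each match code is 18 bits long including the identifier.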

FIG. 12 is a diagram illustrating an example of a position information table T2 used to manage the position information of the storage region. The position information table T2 is used to manage the positions, in the storage unit 13, of the storage regions (the storage region B1, the storage region B2, the storage region B3, the storage region B4, and the like) used for the decompression process. The position information table T2 includes a start position Q1, an end position Q2, and a read position Q3 of the storage region B1 to which the file F2 is loaded. The position information table T2 includes a start position Q4, an end position Q5, a reference position Q6, and an update position Q7 of the storage region B2. The position information table T2 includes a start position Q8, an end position Q9, a reference position Q10, and an update position Q11 of the storage region B3. The position information table T2 includes a start position Q12, an end position Q13, and a write position Q14 of the storage region B4 reserved for generating the file F3. An initial value of each piece of position information stored in the position information table T2 is set by the control unit 121. The start position and the end position of each storage region indicate a storage start position and end position of data to be compressed or decompressed. Since a header or the like is excluded, for example, the initial value of the read position Q3 is the same as the start position Q1. The initial values of the reference position Q6 and the update position Q7 are also the same as the start position Q4. The initial values of the reference position Q10 and the update position Q11 are the same as the start position Q8, and the initial value of the write position Q14 is the same as the start position Q12.
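The position information table T2 may be pictured, for example, as in the following Python sketch, in which every read, reference, update, or write position is initialized to the start position of its storage region, as stated above. The class name RegionPositions, the function init_position_table, and the region sizes are hypothetical and are chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class RegionPositions:
    start: int                      # e.g. Q1, Q4, Q8, Q12
    end: int                        # e.g. Q2, Q5, Q9, Q13
    cursors: Dict[str, int] = field(default_factory=dict)

def init_position_table(layout):
    """layout maps a region name to (start, size, names of its movable positions)."""
    table = {}
    for name, (start, size, cursor_names) in layout.items():
        # Each movable position starts at the start position of its region.
        table[name] = RegionPositions(start, start + size,
                                      {c: start for c in cursor_names})
    return table

# Illustrative sizes only; the embodiment does not fix these values.
T2 = init_position_table({
    "B1": (0, 4096, ["read"]),                    # Q3
    "B2": (4096, 4096, ["reference", "update"]),  # Q6, Q7
    "B3": (8192, 4096, ["reference", "update"]),  # Q10, Q11
    "B4": (12288, 16384, ["write"]),              # Q14
})
```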

FIG. 13 is an exemplary flowchart illustrating the decompression process. When the decompression function is first called through an operation of an operating system or an application program in the computer 1 (S700), the control unit 121 executes preprocessing (S701). The preprocessing of S701 is, for example, reservation of the storage region B1, the storage region B2, and the storage region B3 illustrated in FIG. 2, reservation of the storage region B4 in which decompressed data is stored, and setting of each piece of position information (for example, each piece of position information illustrated in FIG. 12) in each storage region. The process of S701 may be executed only when an identifier indicating that the compression process of the embodiment is executed is included in the header of the compressed file. When the process of S701 ends, the control unit 121 loads the decompression target file F2 to the storage region B1 (S702). The control unit 121 sets the end position Q2 based on termination of the file F2.

Next, referring to the data (identifier) at the read position Q3 of the storage region B1, the control unit 121 determines whether the identifier does not indicate the compressed code by the compression process of the embodiment (the identifier “0”) or indicates the compressed code (the identifier “10” or “11”) (S703). According to the format of the compressed data exemplified in FIG. 11, since the beginning of the compressed data is the identifier, the data at the read position Q3 is the identifier. When the identifier is assigned to another position in the compressed data, the control unit 121 refers to this position in S703.

When the identifier is “0” in S703 (Yes in S703), the control unit 121 generates the decompressed data and writes the generated decompressed data at the write position Q14 of the storage region B4 (S704). The decompressed data generated in S704 is data obtained by decompressing the Huffman code included in the compressed data. As described above, when the compressed data is generated using coding other than the Huffman coding, the decompression process is executed according to the coding used in the compression process. The control unit 121 advances the write position Q14 of the decompressed data by the length (unit length) of the unit data. Conversely, when the identifier is “10” or “11” in S703 (No in S703), the control unit 121 causes the reference unit 124 to refer to the storage region B3.

FIG. 14 is an exemplary flowchart illustrating a reference process for the storage region B3. When the reference unit 124 receives an instruction of the reference process of the storage region B3 from the control unit 121 (S800), the reference unit 124 determines whether the identifier is “11” or not (that is, the identifier is “10”) (S801). When the identifier is “11” (Yes in S801), the reference unit 124 reads the compressed code from the read position Q3 of the storage region B1 (S802). The reference unit 124 reads the data with a length LX from a position QX in the storage region B3 based on the position QX and the length LX indicated in the compressed code (S803). The position QX indicates the longest matching position in the storage region A3 at the time of the compression process. The length LX indicates the matching length L3a of the longest matching data string at the time of the compression process. The reference unit 124 returns the data read in S803 to the control unit 121 after the process of S803. However, when the identifier is not “11” in the determination of S801 (No in S801), the process ends (S804).

FIG. 15 is an exemplary flowchart illustrating a reference process for the storage region B2. When the reference unit 122 receives an instruction of the reference process of the storage region B2 from the control unit 121 (S900), the reference unit 122 determines whether the identifier is “10” or not (that is, the identifier is “11”) (S901). When the identifier is “10” (Yes in S901), the reference unit 122 reads the compressed code from the read position Q3 of the storage region B1 (S902). The reference unit 122 reads the data with the length LX from the position QX in the storage region B2 based on the position QX and the length LX indicated in the compressed code (S903). The position QX indicates the longest matching position in the storage region A2 at the time of the compression process. The length LX indicates the matching length L2a of the longest matching data string at the time of the compression process. The reference unit 122 returns the data read in S903 to the control unit 121 after the process of S903. However, when the identifier is not “10” in the determination of S901 (No in S901), the process ends (S904).

The control unit 121 generates the decompressed data according to the reference results of S705 and S706 and writes the generated decompressed data at the write position Q14 of the storage region B4 (S707). That is, in S707, the control unit 121 writes the data read in S803 or S903 as the decompressed data at the write position Q14.

Further, the control unit 121 instructs the updating unit 125 to execute the updating process of the storage region B3 based on the reference result of the reference unit 122 (S708).

FIG. 16 is an exemplary flowchart illustrating an updating process for the storage region B3. When an instruction of the updating process of the storage region B3 is received from the control unit 121 (S1000), the updating unit 125 determines whether the identifier of the processing target compressed data is “10” (S1001). When the identifier is “10” (Yes in S1001), the updating unit 125 reads the data with the length LX from the position QX of the storage region B2 (S1002). The updating unit 125 stores the data read in S1002 at the update position Q11 of the storage region B3 (S1003). The updating unit 125 updates the update position Q11 according to the storage of S1003 (S1004). In S1004, the update position Q11 moves by the length LX. When the identifier is not “10” (No in S1001) or when the process of S1004 has been executed, the updating unit 125 ends the process (S1005).

When the process of the updating unit 125 ends (the process of S708 ends) or the process of S704 ends, the control unit 121 instructs the updating unit 123 to execute the updating process of the storage region B2 (S709). The processing order of S705 and S706 executed by the control unit 121 may be interchanged. The processing order of S707 and S708 may also be interchanged.

FIG. 17 is an exemplary flowchart illustrating an updating process for the storage region B2. When an instruction of the updating process of the storage region B2 is received from the control unit 121 (S1100), the updating unit 123 determines whether the identifier of the processing target compressed data is “0” (S1101). When the identifier is “0” in S1101 (Yes in S1101), the updating unit 123 stores the decompressed data generated in S704 at the update position Q7 (S1102). The updating unit 123 updates the update position Q7 according to the storage of S1102 (S1103). In the updating of S1103, the update position Q7 moves by the length (unit length) of the unit data.

When the identifier is not “0” in S1101 (No in S1101), the updating unit 123 determines whether the identifier is “10” (S1104). When the identifier is “10” (Yes in S1104), the data read from the storage region B2 in S1002 is stored at the update position Q7 (S1105). The updating unit 123 updates the update position Q7 according to the storage of S1105 (S1106). In the updating of S1106, the update position Q7 moves by the length (unit length) of the unit data. When the process of S1103 or S1106 has been executed, or when the identifier is not “10” in S1104 (No in S1104), the updating unit 123 ends the process (S1107).

When the updating unit 123 ends the updating process of the storage region B2, the control unit 121 updates the read position Q3 (S710). In S710, the read position Q3 moves by an amount according to the compressed data. The control unit 121 determines whether the updated read position Q3 reaches the end position Q2 of the compressed file F2 (S711). When the updated read position Q3 does not reach the end position Q2 of the compressed file F2 (No in S711), the control unit 121 executes the process of S703 again.

When the read position Q3 reaches the end position Q2 of the compressed file F2 (Yes in S711), the control unit 121 generates the decompressed file F3 based on the decompressed data stored in the storage region B4 (S712). When the process of S712 ends, the control unit 121 ends the decompression function (S713). For example, in S713, the control unit 121 notifies the caller of the decompression function that the decompression process has ended.
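The flow of FIGS. 13 to 17 may be summarized, purely as a simplified sketch, by the following Python code. Several simplifications are assumed here and are not part of the embodiment: the compressed data is given as already parsed tokens rather than a bit stream, the Huffman decoding of the identifier “0” is replaced by a literal unit carried in the token, the storage regions grow without wraparound, and positions are counted from the start of each region. The function name decompress is hypothetical.

```python
def decompress(tokens):
    """Replay tokens ("0", unit), ("10", qx, lx), or ("11", qx, lx).

    Returns the decompressed data (region B4) together with the final
    contents of the regions B2 and B3.
    """
    b2, b3, b4 = bytearray(), bytearray(), bytearray()
    for token in tokens:
        identifier = token[0]
        if identifier == "0":
            unit = token[1]          # stands in for the Huffman-decoded unit data
            b4 += unit               # S704: write to region B4
            b2 += unit               # S1102: update region B2
        elif identifier == "10":
            _, qx, lx = token
            data = bytes(b2[qx:qx + lx])   # S903: refer to region B2
            b4 += data               # S707: write to region B4
            b3 += data               # S1003: update region B3
            b2 += data               # S1105: update region B2
        elif identifier == "11":
            _, qx, lx = token
            data = bytes(b3[qx:qx + lx])   # S803: refer to region B3
            b4 += data               # S707: write to region B4
        else:
            raise ValueError("unknown identifier: %r" % (identifier,))
    return bytes(b4), bytes(b2), bytes(b3)

tokens = [("0", bytes([c])) for c in b"horse "]
tokens += [("10", 0, 5), ("0", b" "), ("11", 0, 5)]
restored, b2_state, b3_state = decompress(tokens)
```

In this sketch the third occurrence of “horse” is restored from the region B3, which received the string when the second occurrence was restored from the region B2.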

For the compressed data with the identifier “0”, in the decompression process corresponding to the modification example of the compression process, the decompressed data is generated and stored in the storage region B2, and the decompressed data is cross-checked against the data in the storage region B2, as described above in the description of FIG. 2. When a longest matching data string which matches the decompressed data by a length equal to or greater than the predetermined length Lmin is present in the storage region B2 as the result of the cross-checking, the longest matching data string is stored in the storage region B3. Thus, the storage region B3 enters the same state as the storage region A3. That is, the reference unit 122 executes the process of cross-checking the decompressed data obtained from the compressed data with the identifier “0” against the data in the storage region B2, and the updating unit 125 updates the storage region B3 according to the cross-checking result of the reference unit 122.

FIG. 18 is an exemplary flowchart illustrating a reference process for the storage region B2. When an instruction of the reference process of the storage region B2 is received from the control unit 121 (S1200), the reference unit 122 sets the reference position Q6, the matching length L2b, and the longest matching position Q2b (S1201). The reference position Q6 and the longest matching position Q2b are set to be the same as the start position Q4 or the same as the update position Q7. The matching length L2b is set to, for example, “0”. The reference unit 122 further sets a counter value k to an initial value (for example, k=0) (S1202).

The reference unit 122 determines whether the identifier of the data at the position (Q3+k), moved by the counter value k from the read position Q3 of the storage region B1, is the identifier “0”, which does not indicate the compressed code by the compression process of the embodiment (S1203). When the identifier at the position (Q3+k) is the identifier “0” (Yes in S1203), the reference unit 122 determines whether the unit data in the storage region B1 matches the unit data in the storage region B2 (S1204). In this determination, it is determined whether the unit data at the position (Q3+k), moved by the counter value k from the read position Q3 of the storage region B1, matches the unit data at a position (Q6+k), moved by the counter value k from the reference position Q6 of the storage region B2.

When the unit data at the position (Q3+k) matches the unit data at the position (Q6+k) (Yes in S1204), the reference unit 122 increases the counter value k (S1205). The reference unit 122 further determines whether the counter value k is equal to or greater than a predetermined length Lmax (S1206). The predetermined length Lmax is the same as the predetermined length Lmax in FIG. 7. When the counter value k is equal to or greater than the predetermined length Lmax (Yes in S1206), the reference unit 122 assigns the counter value k to the matching length L2b and assigns the value of the reference position Q6 to the longest matching position Q2b (S1207). Here, “=” expressed in the process of S1207 of FIG. 18 indicates an assignment operator. When the counter value k is less than the predetermined length Lmax (No in S1206), the reference unit 122 executes the determination of S1203 again. S1206 and S1207 are optional steps; after S1205 is executed, the determination of S1203 may instead always be executed.

When the identifier at the position (Q3+k) is “10” or “11” (No in S1203) or the unit data at the position (Q3+k) does not match the unit data at the position (Q6+k) (No in S1204), the reference unit 122 determines whether the counter value k is greater than the matching length L2b (S1208). When the counter value k is greater than the matching length L2b (Yes in S1208), the reference unit 122 assigns the counter value k to the matching length L2b and assigns the value of the reference position Q6 to the longest matching position Q2b (S1209). When the counter value k is equal to or less than the matching length L2b (No in S1208) or after the process of S1209 is executed, the reference unit 122 increases the reference position Q6 (S1210).

The reference unit 122 determines whether the reference position Q6 increased in S1210 reaches the end position Q5 (S1211). This determination applies when the reference position Q6 is initially set at the start position Q4. When, as described above, the reference position Q6 is initially set at the update position Q7, it is determined in S1211 whether the reference position Q6 has returned to the update position Q7. When the reference position Q6 is initially set at the update position Q7 and the reference position Q6 reaches the end position Q5, the reference position Q6 returns to the start position Q4 through the process of S1210. Here, “=” of “Q6=Q5” expressed in the process of S1211 of FIG. 18 is an equal sign. When the reference position Q6 is not the end position Q5 (No in S1211), the reference unit 122 executes the process of S1202 again.

When the reference position Q6 is the end position Q5 (Yes in S1211) or the process of S1207 has been executed, the reference unit 122 returns the longest matching position Q2b and the matching length L2b to the control unit 121, and the reference process of the storage region B2 ends (S1212).
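The search of FIG. 18 may be sketched in Python as follows. The sketch assumes that the run of unit data with the identifier “0” starting at the read position Q3 has already been decoded into a byte string (pending), that the reference position Q6 starts at the beginning of the storage region B2, and that positions are counted from the start of the region; the function name longest_match is hypothetical.

```python
def longest_match(pending, b2, lmax):
    """Return (Q2b, L2b): position and length of the longest match in b2."""
    best_pos, best_len = 0, 0                 # S1201
    for q6 in range(len(b2)):                 # S1210 / S1211
        k = 0                                 # S1202
        while (k < len(pending) and q6 + k < len(b2)
               and pending[k] == b2[q6 + k]):  # S1203 / S1204
            k += 1                            # S1205
            if k >= lmax:                     # S1206
                return q6, k                  # S1207, then S1212
        if k > best_len:                      # S1208
            best_pos, best_len = q6, k        # S1209
    return best_pos, best_len                 # S1212

position, length = longest_match(b"horse and", b"a hot horse", 15)
```

When the returned length is equal to or greater than the predetermined length Lmin, the matched string is the candidate to be copied into the storage region B3.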

When the matching length L2b obtained in the procedure of FIG. 18 is equal to or greater than the predetermined length Lmin, the control unit 121 causes the updating unit 125 to execute the updating process for the storage region B3. The updating unit 125 reads the data with the length L2b from the longest matching position Q2b in the storage region B2 and stores the read data at the update position Q11 of the storage region B3. Through the process of the updating unit 125, in the modification example, the storage region B3 is maintained to be the same as the storage region A3.

In the above-described decompression process, the decompression is executed while updating the storage region B3 to the same state as the storage region A3 used in the compression process.

Hereinafter, hardware and software used in the embodiment will be described.

FIG. 19 is a diagram illustrating an exemplary hardware configuration of the computer 1. The computer 1 includes, for example, a processor 301, a random access memory (RAM) 302, a read-only memory (ROM) 303, a drive device 304, a storage medium 305, an input interface (I/F) 306, an input device 307, an output interface (I/F) 308, an output device 309, a communication interface (I/F) 310, a storage area network (SAN) interface (I/F) 311, and a bus 312. The hardware components are connected to each other via the bus 312.

The RAM 302 is a memory device capable of executing reading and writing. For example, a semiconductor memory such as a static RAM (SRAM) or a dynamic RAM (DRAM) is used, or a flash memory is used in place of a RAM. The ROM 303 includes a programmable ROM (PROM). The drive device 304 is a device that executes at least one of reading and writing of information recorded on the storage medium 305. The storage medium 305 stores information written by the drive device 304. The storage medium 305 is, for example, a storage medium such as a hard disk, a flash memory such as a solid state drive (SSD), a compact disc (CD), a digital versatile disc (DVD), or a Blu-ray disc. For example, the computer 1 includes the drive device 304 and the storage medium 305 for each of a plurality of kinds of storage media.

The input interface 306 is a circuit that is connected to the input device 307 and delivers an input signal received from the input device 307 to the processor 301. The output interface 308 is a circuit that is connected to the output device 309 and causes the output device 309 to execute output according to an instruction from the processor 301. The communication interface 310 is a circuit that controls communication via the network 3. The communication interface 310 is, for example, a network interface card (NIC). The SAN interface 311 is a circuit that controls communication with a storage device connected to the computer 1 via a storage area network. The SAN interface 311 is, for example, a host bus adapter (HBA).

The input device 307 is a device that transmits an input signal according to an operation. The input device 307 is, for example, a key device such as a button mounted on a keyboard or the body of the computer 1 or a pointing device such as a mouse or a touch panel. The output device 309 is a device that outputs information according to control of the computer 1. The output device 309 is, for example, an image output device (display device) such as a display or an audio output device such as a speaker. For example, an input/output device such as a touch screen is used as the input device 307 and the output device 309. The input device 307 and the output device 309 may be integrated with the computer 1 or may be, for example, devices connected externally to the computer 1 without being included in the computer 1.

For example, the processor 301 reads a program stored in the ROM 303 or the storage medium 305 to the RAM 302 and executes a process of the compression unit 11 or a process of the decompression unit 12 according to the procedure of the read program. At this time, the RAM 302 is used as a work area of the processor 301. The function of the storage unit 13 is realized when the ROM 303 and the storage medium 305 store program files (an application program 24, a middleware 23, and an OS 22 to be described below) and data files (the compression target file F1, the compressed file F2, and the like) and the RAM 302 is used as a work area of the processor 301. The program read by the processor 301 will be described with reference to FIG. 20.

FIG. 20 is a diagram illustrating an example of the configuration of a program operating in the computer 1. In the computer 1, an operating system (OS) 22 controlling a hardware group 21 (301 to 312) illustrated in FIG. 19 operates. The processor 301 operates according to the procedure of the OS 22 to control and manage the hardware group 21 so that a process according to the application program 24 or the middleware 23 is realized by the hardware group 21. In the computer 1, the middleware 23 or the application program 24 is read to the RAM 302 and is executed by the processor 301.

When the compression function is called, the processor 301 executes a process based on at least a part of the middleware 23 or the application program 24 (executes the process by controlling the hardware group 21 based on the OS 22) so that the function of the compression unit 11 is realized. When the decompression function is called, the processor 301 executes a process based on at least a part of the middleware 23 or the application program 24 (executes the process by controlling the hardware group 21 based on the OS 22) so that the function of the decompression unit 12 is realized. The compression function and the decompression function may each be included in the application program 24 or may be a part of the middleware 23 called to be executed according to the application program 24.

Since a search range for extracting data matching processing target data is suppressed in the compression function of the application program 24 (or the middleware 23), a load of memory access of the processor 301 is suppressed. Therefore, a time in which the work area is reserved on the RAM 302 is also reduced.

FIG. 21 is a diagram illustrating an example of the configuration of apparatuses in a system according to the embodiment. The system in FIG. 21 includes a computer 1 a, a computer 1 b, a base station 2, and a network 3. The computer 1 a is connected, in at least one of wireless and wired manners, to the network 3 to which the computer 1 b is connected.

The compression unit 11 and the decompression unit 12 illustrated in FIG. 3 may be included in one of the computers 1 a and 1 b illustrated in FIG. 21. The computer 1 b may include the compression unit 11 and the computer 1 a may include the decompression unit 12. The computer 1 a may include the compression unit 11 and the computer 1 b may include the decompression unit 12. Both of the computers 1 a and 1 b may include the compression unit 11 and the decompression unit 12.

In the compression process according to the embodiment, a processing time taken to execute the compression process is reduced. In the compression process according to the embodiment, the matching determination is executed again with the data in the storage region in which only the data string determined to be the longest matching data string is stored. Therefore, this advantage may be further obtained when a file is a file (for example, an address book) in which the same data string is used repeatedly.

Hereinafter, modification examples of the above-described embodiment will be described. In addition to the following modification examples, design modifications may be appropriately made within the scope of the disclosure without departing from the gist of the disclosure.

FIG. 22 is a diagram illustrating an example of counting of the number of extractions. The storage region A5 is a storage region in which a numerical value corresponding to each piece of unit data stored in the storage region A3 may be stored. The numerical value corresponding to each piece of unit data indicates the number of times the data is extracted as the longest matching data string. In FIG. 22, the number of times the unit data “h” stored in the storage region A3 is extracted as a part of the longest matching data string is 8. In the storage region A5, the numerical value corresponding to each piece of unit data is expressed with, for example, 4 bits. In that case, the number of times each piece of unit data is extracted may be counted up to 15. For example, in S108 of FIG. 5, the number of extractions is counted by increasing each numerical value corresponding to the read data when the control unit 111 reads the data with the matching length L3a from the longest matching position P3a in the storage region A3. When the data is written at the update position P11 in S404 of FIG. 8, the numerical value corresponding to the data at the position at which the writing is executed is reset.
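The counting in the storage region A5 may be sketched, for example, as the following Python class. Holding each 4-bit value at 15 once it is reached (saturation) is an assumption of this sketch, since the description only states that up to 15 extractions may be counted; the class and method names are hypothetical.

```python
COUNTER_MAX = 15  # largest value that fits in the 4 bits described for region A5

class ExtractionCounters:
    """One counter per unit data position of the storage region A3."""

    def __init__(self, size):
        self.counts = [0] * size

    def record_extraction(self, position, length):
        # Called when data of the given length is read from the given position
        # as the longest matching data string (S108 in the description).
        for i in range(position, position + length):
            self.counts[i] = min(self.counts[i] + 1, COUNTER_MAX)

    def reset(self, position):
        # Called when new data is written at this position (S404).
        self.counts[position] = 0

counters = ExtractionCounters(8)
for _ in range(8):
    counters.record_extraction(0, 5)   # "horse" extracted 8 times
```

After these calls, each of the five positions holding “horse” has the count 8, matching the value shown for “h” in FIG. 22.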

For example, the number of extractions is used at the time of the updating of the storage region A3. When the number of extractions of the data at the position at which the data is stored in S404 is greater than a threshold value, the process of S405 may be executed, and then the data may be stored in the process of S404 again.

When the same data appears many times in the file F1, the data evicted from the storage region A3 may be stored in the storage region A3 again through the reference to the storage region A2. However, when the number of extractions is counted and data extracted many times is kept registered in the storage region A3, the number of references to the storage region A2 is suppressed.

When the structure of FIG. 22 is used in the compression process, the same structure is also provided in the decompression process. In the decompression process, the number of extractions is counted in S705 of FIG. 13. In S1003 of FIG. 16, the value indicating the number of extractions is reset, and it is further determined whether the number of extractions is greater than the threshold value. When the number of extractions is greater than the threshold value, the process of S1004 is executed. Thereafter, the process of S1003 is executed and the data is stored in the storage region B3.

FIG. 23 is a diagram illustrating an example of control of the sizes of the storage region A2 and the storage region A3. According to the procedure of the above-described compression process, the amount of data stored in the storage region A3 in the initial stage of the compression process is small. When a large amount of data is stored in the storage region A3, there is a high probability of the compressed code being generated through the compression process according to the embodiment. Therefore, when the data is stored in the storage region A3 as early as possible, the ratio of compressed codes generated by the compression process of the embodiment in the compressed file increases, and thus the compression ratio is expected to be improved. When a large amount of data is stored in the storage region A2, the data matching the processing target data is easily found. Therefore, when the data size of the storage region A2 is large, the probability of the data being stored early in the storage region A3 increases. That is, in the initial stage of the compression process, the size of the storage region A3 does not have to be large. However, when the size of the storage region A2 is large, the compression ratio may be expected to be improved.

On the other hand, when the data is accumulated in the storage region A3, the data matching the processing target data is easily extracted from the storage region A3. When it is easy to execute compression coding on the data in the file F1 by referring to the storage region A3, the compression ratio is maintained even though the updating of the storage region A3 is not frequently executed. In addition, since data appearing only once may appear in the file F1 as well as the repeated data, the reference to the storage region A2 may be executed unnecessarily for the data appearing only once.

Accordingly, for example, when data equal to or greater than the threshold value is stored in the storage region A3 (the update position P11 exceeds the threshold value), the control unit 111 may execute control such that the data size of the storage region A2 is decreased. In this case, for example, old data is deleted among the data in the storage region A2. That is, the data corresponding to the change in the data size from the update position P7 is deleted. FIG. 23 schematically illustrates a state in which the data after the update position P7 is excluded from the storage region A2 of the stage in which the update position P11 does not exceed the threshold value, so that the storage region A2 enters the stage in which the update position P11 exceeds the threshold value.
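One possible reading of this size control is sketched below in Python. The sketch assumes that the storage region A2 is a full circular buffer in which the oldest unit data lies at the update position P7, and that after shrinking the remaining data is rearranged so that the next write again lands on the oldest remaining unit; the function name control_a2_size and the parameter reduced_size are hypothetical.

```python
def control_a2_size(a2, p7, p11, threshold, reduced_size):
    """Return (new_a2, new_p7) after optionally shrinking the region A2."""
    if p11 <= threshold or reduced_size >= len(a2):
        return a2, p7                      # region A3 not yet filled enough
    drop = len(a2) - reduced_size          # amount of old data to delete
    oldest_first = a2[p7:] + a2[:p7]       # unroll the circular buffer
    return oldest_first[drop:], 0          # delete the oldest units

# "ABCDEFGH" was written in order; the buffer wrapped, so the oldest is at p7=4.
a2 = bytearray(b"EFGHABCD")
new_a2, new_p7 = control_a2_size(a2, p7=4, p11=3000, threshold=2048, reduced_size=5)
```

For the decompression side to stay consistent, the same rule would have to be applied to the storage region B2 at the same point in the data.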

Control may be executed such that the size of the storage region A2 is decreased, and control may also be executed such that the size of the storage region A3 is increased. Thus, it is possible to reduce the frequency at which data that is stored in the storage region A3 and has been included in the longest matching data string many times is deleted through the updating process for the storage region A3, or the frequency at which the control described with reference to FIG. 22 is executed.

When the control of the sizes illustrated in FIG. 23 is executed, the control unit 121 also executes the same control of the sizes on the storage region B2 and the storage region B3. Thus, the code is converted by referring to the same data on the compression side and the decompression side.

A compression processing target may be a monitoring message output from the system as well as the data in the file. For example, processes of compressing monitoring messages stored sequentially in a buffer through the above-described compression process and storing the compressed messages as a log file are executed. For example, compression may be executed in units of pages in a database or compression may be executed in units in which a plurality of pages are organized.

The data to be subjected to the above-described compression process is not limited to the character information described above. The data to be subjected to the compression process may be information with only numerical values, and the above-described compression process may be used for data such as images or audio. For example, since a file that largely consists of data obtained through audio synthesis includes many repetitions in the data, the compression ratio is expected to be improved by a dynamic dictionary. A moving image photographed by a fixed camera also includes many repetitions since images of respective frames are similar. Therefore, it is possible to obtain the same advantages as those of document data or audio data by applying the above-described compression process.

All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiment of the present invention has been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.

What is claimed is:
 1. A non-transitory storage medium storing a programfor causing a computer to execute a process, the process comprising:first setting a first storage region and a second storage region in amemory; first comparing a compression target data in a file with data inthe first storage region; first creating a first compression code of thecompression target data based on the data in the first storage regionwhen a predetermined first consistency between the compression targetdata and the data in the first storage region is detected; secondcomparing the compression target data with data in the second storageregion when the predetermined first consistency between the compressiontarget data and the data in the first storage region is not detected,the compression target data being moved to the second storage regionafter the second comparing; and first storing the compression targetdata into the first storage region when a predetermined secondconsistency between the compression target data and the data in thesecond storage region is detected.
 2. The storage medium according to claim 1, wherein the predetermined first consistency indicates that a data size of longest matching data between the compression target data and the data in the first storage region is greater than a predetermined first size.
 3. The storage medium according to claim 1, wherein the process further includes: second creating a second compression code of the compression target data based on the data in the second storage region when a predetermined second consistency between the compression target data and the data in the second storage region is detected based on the second comparing.
 4. The storage medium according to claim 3, wherein the predetermined second consistency indicates that a data size of longest matching data between the compression target data and the data in the second storage region is greater than a predetermined second size.
 5. The storage medium according to claim 3, wherein the first compression code and the second compression code are the same type of code and include an identifier indicating being created based on one of the first comparing and the second comparing.
 6. The storage medium according to claim 5, wherein the process further includes: third creating a third compression code of the compression target data based on a different type of coding process than a coding process of the first compression code and the second compression code when the predetermined second consistency between the compression target data and the data in the second storage region is not detected based on the second comparing.
 7. The storage medium according to claim 6, wherein the second comparing compares the compression target data with the data in the second storage region for a predetermined data unit, and the third creating creates the third compression code based on the compression target data with the data unit.
 8. The storage medium according to claim 1, wherein the first compression code includes position information indicating a position in the first storage region of data portion longest matching with the compression target data among the data in the first storage region and length information indicating a matching length of the data portion.
 9. The storage medium according to claim 3, wherein the second compression code includes position information indicating a position in the second storage region of data portion longest matching with the compression target data among the data in the second storage region and length information indicating a matching length of the data portion.
 10. The storage medium according to claim 1, wherein the first setting sets the second storage region with a first size in the memory, the first storing stores the compression target data into the first storage region according to position information indicating a position in the first storage region, the position information being updated according to a size of the compression target data, and the process further includes: first changing a size of the second storage region from the first size to a second size smaller than the first size when a total size of data in the first storage region indicated by the position information exceeds a predetermined size.
 11. The storage medium according to claim 10, wherein the process further includes: the first changing includes deleting data stored foremost among data stored in the second storage region.
 12. The storage medium according to claim 10, wherein the first setting sets the first storage region with a third size in the memory, the process further includes: second changing a size of the first storage region from the third size to a fourth size greater than the third size when the size of the second storage region is changed from the first size to the second size by the first changing.
 13. The storage medium according to claim 1, wherein the first setting sets the first storage region with a third size in the memory, the first storing includes: storing the compression target data into the first storage region according to position information indicating a position in the first storage region, updating the position information according to a size of the stored compression target data, and controlling update of the position information to store the compression target data within the third size, and the process further includes: counting a number of detecting the data in the first storage region that the predetermined first consistency with the compression target data is detected; and second storing a stored data following the compression target data into the first storage region when the compression target data is overwritten the stored data in the first storage region with the number of detecting exceeds a predetermined number according to the position information.
 14. A non-transitory storage medium storing a program for causing a computer to execute a process decompressing a compressed file, the process comprising: reading a compressed data from the compressed file, the compressed file being created based on a first compression code, a second compression code and a third compression code by: first setting a first storage region and a second storage region in a first memory, first comparing a compression target data in a file with data in the first storage region, first creating the first compression code of the compression target data based on the data in the first storage region when a predetermined first consistency between the compression target data and the data in the first storage region is detected, second comparing the compression target data with data in the second storage region when the predetermined first consistency between the compression target data and the data in the first storage region is not detected, the compression target data being moved to the second storage region after the second comparing, when a predetermined second consistency between the compression target data and the data in the second storage region is detected, first storing the compression target data into the first storage region and second creating the second compression code of the compression target data based on the data in the second storage region, the first compression code and the second compression code are the same type of code and include an identifier indicating being created based on one of the first comparing and the second comparing, and third creating the third compression code of the compression target data based on a different type of coding process than a coding process of the first compression code and the second compression code when the predetermined second consistency between the compression target data and the data in the second storage region is not detected based on the second comparing; fourth creating a first decompression data based on data in a fourth storage region of a second memory according to the first compression code when it is determined that the read compressed data is the first compression code; fifth creating a second decompression data based on data in a fifth storage region of the second memory according to the second compression code when it is determined that the read compressed data is the second compression code, the fifth storage region being a different storage region from the fourth storage region; first updating the fourth storage region according to the second compression code and the second decompression data when the second decompression data is created by the fifth creating; sixth creating a third decompression data from the read compressed data based on a decoding process corresponding to the coding process when it is determined that the read compressed data is the third compression code; and third storing the third decompression data into the fourth storage region.
 15. A system comprising: a memory; and a processor coupled to the memory and configured to execute a process including: first setting a first storage region and a second storage region in the memory, first comparing a compression target data in a file with data in the first storage region, first creating a first compression code of the compression target data based on the data in the first storage region when a predetermined first consistency between the compression target data and the data in the first storage region is detected, second comparing the compression target data with data in the second storage region when the predetermined first consistency between the compression target data and the data in the first storage region is not detected, the compression target data being moved to the second storage region after the second comparing, and first storing the compression target data into the first storage region when a predetermined second consistency between the compression target data and the data in the second storage region is detected.
 16. The system according to claim 15, wherein the process further includes: second creating a second compression code of the compression target data based on the data in the second storage region when a predetermined second consistency between the compression target data and the data in the second storage region is detected based on the second comparing, and the first compression code and the second compression code are the same type of code and include an identifier indicating being created based on one of the first comparing and the second comparing.
 17. The system according to claim 15, wherein the first compression code includes position information indicating a position in the first storage region of data portion longest matching with the compression target data among the data in the first storage region and length information indicating a matching length of the data portion.
 18. The system according to claim 15, wherein the first setting sets the second storage region with a first size in the memory, the first storing stores the compression target data into the first storage region according to position information indicating a position in the first storage region, the position information being updated according to a size of the compression target data, and the process further includes: first changing a size of the second storage region from the first size to a second size smaller than the first size when a total size of data in the first storage region indicated by the position information exceeds a predetermined size.
 19. The system according to claim 15, wherein the first setting sets the first storage region with a third size in the memory, the first storing includes: storing the compression target data into the first storage region according to position information indicating a position in the first storage region, updating the position information according to a size of the stored compression target data, and controlling update of the position information to store the compression target data within the third size, and the process further includes: counting a number of detecting the data in the first storage region that the predetermined first consistency with the compression target data is detected; and second storing a stored data following the compression target data into the first storage region when the compression target data is overwritten the stored data in the first storage region with the number of detecting exceeds a predetermined number according to the position information.
 20. A system comprising: a first memory; and a processor coupled to the first memory and configured to execute a process including: reading a compressed data from a compressed file, the compressed file being created based on a first compression code, a second compression code and a third compression code by: first setting a first storage region and a second storage region in a second memory, first comparing a compression target data in a file with data in the first storage region, first creating the first compression code of the compression target data based on the data in the first storage region when a predetermined first consistency between the compression target data and the data in the first storage region is detected, second comparing the compression target data with data in the second storage region when the predetermined first consistency between the compression target data and the data in the first storage region is not detected, the compression target data being moved to the second storage region after the second comparing, when a predetermined second consistency between the compression target data and the data in the second storage region is detected, first storing the compression target data into the first storage region and second creating the second compression code of the compression target data based on the data in the second storage region, the first compression code and the second compression code are the same type of code and include an identifier indicating being created based on one of the first comparing and the second comparing, and third creating the third compression code of the compression target data based on a different type of coding process than a coding process of the first compression code and the second compression code when the predetermined second consistency between the compression target data and the data in the second storage region is not detected based on the second comparing; fourth creating a first decompression data based on data in a fourth storage region of the first memory according to the first compression code when it is determined that the read compressed data is the first compression code; fifth creating a second decompression data based on data in a fifth storage region of the first memory according to the second compression code when it is determined that the read compressed data is the second compression code, the fifth storage region being a different storage region from the fourth storage region; first updating the fourth storage region according to the second compression code and the second decompression data when the second decompression data is created by the fifth creating; sixth creating a third decompression data from the read compressed data based on a decoding process corresponding to the coding process when it is determined that the read compressed data is the third compression code; and third storing the third decompression data into the fourth storage region.